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Abstract 

We construct a soft thresholding operation for rank reduction of hierarchical 
tensors and subsequently consider its use in iterative thresholding methods, in par¬ 
ticular for the solution of discretized high-dimensional elliptic problems. The pro¬ 
posed method for the latter case automatically adjusts the thresholding parameters, 
based only on bounds on the spectrum of the operator, such that the arising tensor 
ranks of the resulting iterates remain quasi-optimal with respect to the algebraic 
or exponential-type decay of the hierarchical singular values of the true solution. 
In addition, we give a modified algorithm using inexactly evaluated residuals that 
retains these features. The practical efficiency of the scheme is demonstrated in 
numerical experiments. 


1 Introduction 

Subspace-based tensor representations such as the hierarchical Tucker format |17| or the 
special case of the tensor train format enable the numerical treatment of problems 
in very high dimensions by exploiting particular low-rank structures. Here one has a 
well-defined notion of tensor rank as a tuple of matrix ranks of certain matricizations. 
From a computational perspective, these tensor formats have the major advantage that 
for any tensor given in such a representation, by a simple combination of linear algebraic 
procedures, one may obtain an error-controlled, quasi-optimal approximation by a tensor 
of lower ranks; this can be achieved by truncating the ranks of a hierarchical singular 

2^ , or HSVD for short, of the tensor. 

In this work, we consider an alternative procedure for reducing ranks that is based 
on soft thresholding of the singular values in a HSVD, as opposed to the mentioned rank 
truncation (which would correspond to their hard thresholding). The new procedure has 
similar complexity and quasi-optimality properties, but unlike the truncation it is non- 
expansive, which turns out to be a major advantage in the context of iterative schemes. 

A large part of the results that we obtain are in fact applicable to fixed point iterations 
based on general contractive mappings. The iterative scheme that we focus on as an 
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example, however, is a Richardson iteration for solving the linear equation 

= f 

on a separable tensor product Hilbert space 


( 1 ) 


n = rLi 


® T-Ld , 


where f G is given and A'.T-L^T-L 'is elliptic on that is, the iteration is of the form 


Ufc+i = (Ufc - n{Auk - f)) 


( 2 ) 


where Sq is the proposed soft thresholding procedure with suitable parameters ak- 

Even when both A and f have exact low-rank representations, the unique solution 
u* of the problem Q may no longer be of low rank. It turns out, however, that in many 
cases of interest, u* can still be efficiently approximated by low-rank tensors up to any 
given error tolerance. Here one can obtain algebraic error decay with respect to the ranks 
under fairly general conditions |20[[^ , and superalgebraic or exponential-type decay in 
more specific situations [8 13 19|. 

When a solution u* that has this property is approximated by an iteration such 
as Q, it is not a priori clear to what extent also the iterates retain comparably low 
ranks, since the basic iteration without truncation can in principle lead to an exponential 
rank increase. That the ranks of u^. remain comparable to those needed for approximat¬ 
ing u* at the current accuracy therefore depends crucially on the appropriate choice of 
thresholding parameters a^- Keeping the tensor ranks of iterates as low as possible is of 
crucial importance for the efficiency of such methods, since the complexity of the tensor 
operations that need to be performed grows like the fourth power of these ranks. 

We show in this work that when the rank reduction in each step is done by the tensor 
soft thresholding procedure that we shall describe, quasi-optimal tensor ranks can be 
enforced for each single iterate u^, irrespective of the rank increase caused by A. This 
means that, assuming that the hierarchical singular values of u* have either algebraic 
or exponential-type decay, the tensor ranks of each can be bounded up to a uniform 
constant by the maximum rank of the best hierarchical tensor approximation to u* of 
the same accuracy. To this end, we exploit the non-expansiveness of soft thresholding, 
which allows us to choose the thresholding parameters in each step as large as required 
to control the ranks, without compromising the convergence of the iteration. 

After describing the construction of the operation Sq,^ , we begin by identifiying choices 
of geometrically decreasing that lead to the desired rank estimates provided that 
the precise decay of the hierarchical singular values of u* is known. On this basis, we 
then construct a scheme which, based on a certain a posteriori criterion, adjusts 
to the unknown decay of hierarchical singular values such that the quasi-optimal ranks 
are preserved. This method requires no a priori information except for bounds on the 
spectrum of A and the norm of f. In a third step, we develop a perturbed version of the 
scheme that permits inexactly evaluated residuals. 

For the case that the rank reduction is done by a truncated HSVD (that is, by hard 
thresholding), a scheme for choosing thresholding parameters that leads to near-optimal 
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ranks is given in [^[^; to the authors’ knowledge, this is the only previous instance of a 
method that guarantees global converge to the true solution while at the same time, the 
arising ranks can be estimated in terms of the ranks required for the approximation of 
the solution. A limitation of the approach used there to control the ranks is that their 
near-optimality is enforced by truncating with a sufficiently high tolerance, which can be 
done only after every few iterations when a certain error reduction has been achieved. 
The ranks of intermediate iterates can therefore still accumulate in the iterations between 
these complexity reductions, which can be problematic if each application of the operator 
already causes a large rank increase. In the method proposed here, this accumulation 
can be ruled out, and intermittent, sufficiently large increases of approximation errors 
that restore quasi-optimality are no longer necessary. 

Note that here we do not address the aspect of an adaptive underlying discretization of 
the problem as considered in . Although our resulting procedure of the form Q can 
in principle be formulated on infinite-dimensional Hilbert spaces, in this work we restrict 
our considerations concerning a numerically realizable version to fixed discretizations. In 
other words, in the form given here, the scheme applies either to infinite-dimensional T-Li ~ 
I' 2 (N) (which is of course not implementable in practice), or to a fixed finite-dimensional 
choice T-Li — The version of the algorithm allowing inexact evaluation of residuals 
can, however, serve as a starting point for combining the method with adaptive concepts 
for identifying suitable subsets of ■^ 2 (N) in the course of the iteration. Furthermore, we 
expect that the concepts put forward here can also be used in the construction of adaptive 
methods for sparse basis representations. 

Iterations using soft thresholding of sequences have been studied extensively in the 
context of inverse and ill-posed problems, see e.g. [^|^|^, where they are especially well 
suited for obtaining convergence under very general conditions. Note that in such a 
setting, a priori choices of geometrically decreasing thresholding parameters have been 
proposed, e.g., in and [^. However, our approach for controlling the complexity of 
iterates - in the present case, the arising tensor ranks - in iterative schemes for well- 
posed problems appears to be new, in particular the a posteriori criterion that steers the 
decrease of the thresholding parameters. 

The proposed method can also be motivated by a variational formulation of the 
problem. For instance, if A is symmetric, the solution is characterized by 

u* = argminj- (^v, v) — (f , v)| . 

A standard way to solve this problem in the spirit of a Ritz-Galerkin method would 
be to restrict it to the manifold of hierarchical tensors with fixed rank and treat the 
resulting minimization problem by Riemannian gradient methods or alternating least 
squares techniques (l^ . However, the sets over which one needs to minimize are not 
convex, and there generally exist many local minima. Roughly speaking, in such methods 
one fixes the model class (in this case by the admissible hierarchical tensor ranks) and 
attempts to minimize the error over this class. 

In an alternative variational formulation, one can prescribe an error tolerance, for 
instance ||^v — f || < e, and attempt to minimize the tensor ranks over the set of such v. 
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Although the admissible set is then convex, even in the matrix case d = 2 the rank does 
not give a convex functional. However, one can instead minimize an appropriate convex 
relaxation such as the £i-norm of singular values. It is well known that in the matrix 
case, such relaxed problems can be solved by proximal gradient methods, which can be 
rewritten as iterative soft thresholding |22] and hence take precisely the form (|^ when 
d = 2. In this case, our method can therefore also be motivated as a rank minimization 
scheme, although this connection does not play a role in the analysis. Note, however, 
that in the case of higher-order tensors, where our soft thresholding procedure no longer 
permits an interpretation of the resulting scheme as a proximal gradient method, this is 
only a formal analogy. 

This article is arranged as follows: in Section we collect some prerequisites con¬ 
cerning the hierarchical tensor format as well as soft thresholding of sequences and of 
Hilbert-Schmidt operators. In Section we then describe and analyze the new soft 
thresholding procedure for hierarchical tensors. In Section we consider the combi¬ 
nation of this procedure with general contractive fixed point iterations and derive rank 
estimates for sequences of thresholding parameters that are chosen based on a priori in¬ 
formation on the tensor approximability of u*. In Section we introduce an algorithm 
that automatically determines a suitable choice of thresholding parameters without using 
information on u*, analyze its convergence, and additionally give a modified version of 
the scheme based on inexact residual evaluations. In Sectionwe conclude with numer¬ 
ical experiments on a discretized high-dimensional Poisson problem that illustrate the 
practical performance of the proposed method. 

Throughout this paper, the notation A < B is used to indicate that there exists a 
constant C > 0 such that A < CB. 

2 Preliminaries 

By II'll, we always denote either the canonical norm on T-L, which is the product of the 
norms on the Tit, or the .^ 2 -iiorm when applied to a sequence. 

To quantify the sparsity of sequences, we shall frequently use membership in weak-ip 
spaces, which are defined as follows: For a given sequence a = {ak)k£n, for each n G N 
let a* be the n-th largest of the values |afc|. Then for p > 0, the space ip^oo is defined as 
the collection of sequences for which 

1 

l«l^,oo := supnpoj; 
nSN 

is finite, and this quantity defines a (quasi-)norm on ip^oo- We will use these spaces with 
p <2, which implies (.p^oo C (. 2 ] note that one always has ip C ip^oo C ip' for all p' > p. 

For separable Hilbert spaces 5i, ^ 2 , we write HS(5i,52) for the space of Hilbert- 
Schmidt operators from 5i to S 2 with the Hilbert-Schmidt norm II'UhS) which reduces to 
the Frobenius norm in the case of finite-dimensional spaces. Hilbert-Schmidt operators 
have a singular value decomposition with singular values in ^ 2 , satisfying the following 
perturbation estimate, cf. |23]. 
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Theorem 2.1. Let S 2 he separable Hilbert spaces, let X,X G HS(5i,52), and let 
cr,d' € ■^ 2 (N) denote the corresponding sequences of singular values. Then ||cj — 5'||£2(N) — 

||X-X||hs. 

Note that this was shown for matrices in (^ , but the proof immediately carries over 
to Hilbert-Schmidt operators. 

2.1 The Hierarchical Tensor Format 

We now briefly recall dehnitions and facts concerning the hierarchical Tucker format |17] 
and collect some basic observations that will play a role later. For further details on the 
hierarchical format, we refer to [I^ . 

Throughout this work, we assume d >2. Let T be a binary dimension tree for tensor 
order d, that is, with root element {1,... ,d}; examples for d = 4 are given in Figure 
We adopt the terminology of [Id] , referring to the collections of basis vectors in the 
leaves of the tree as mode frames, and to the coefficient tensors at interior nodes of the 
tree as transfer tensors. 

We shall later make use of a certain equivalence between dimension trees, which we 
formulate in terms of the edges in these trees. For each node n G T, we set [n] := 
{1,..., d} \ n. In general, [n] ^ T. Let 

E := {{n, [n]} : n G T \ {1,..., d}} . 

Then the elements of E correspond precisely to the edges in the tree T, where the root 
element {1,..., d} is regarded as part of an edge. We set E := ffE = 2d — 3. 

For a given set of edges, there are several dimension trees that correspond to the same 
matricizations of the tensor, but have the root element of the tree at a different edge. 
This is illustrated in Figure for a tensor of order four. Moving the root element in 
the tensor representation can be done in practice by basic linear algebra manipulations, 
where the existing component tensors in the representation are simply relabelled and 
reorthogonalized accordingly. For instance, in passing from the hrst to the second tree 
in Figure]^ the transfer tensor for node {2,3,4} is relabelled to {1,3,4}. 

For what follows, we always assume a hxed enumeration {ret, [ret]}, t = 1,..., F, of E. 
Note that the efficiency of the algorithms we will describe may depend on this sequence. 
For practical purposes, it should be chosen such that moving the root element from one 
edge to the next in the enumeration takes as little work as possible, as for instance in 
Figure]^ For each t, we denote by Aft(u) the matricization corresponding to the t-th 
edge of the tensor u, which for inhnite-dimensional TL is a Hilbert-Schmidt operator 

Mtiu): (g) ^ (g)Hi, 

*S[nt] i&nt 

and by we denote the mapping that converts a matricization back to a tensor. Note 
that for each t, one has Af 4 (u) -|- A4i(v) = A4t(u -|- v) and 

||u|| = ||Mi(u)||HS- (3) 
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{ 1 , 2 , 3 , 4 } 

{1} {2,3,4} 

{2'^^ 4} 

{3O4} 


{1,2, 3,4} 

{2} {1,3,4} 

{{^^4} 

{ 3 O 4 } 


{1,2, 3, 4} 


{1,2,3,4} 
{1,2,4} {3} 

{1,^^^4} 

{ 1 O 2 } 


{1,2} {3,4} 

{1O2} {3'04} 

{1,2, 3, 4} 

{1,2.3} {4} 

{l.^^^S} 

{ 1 O 2 } 


Figure 1: Examples of equivalent dimension trees obtained by moving the root element. 


The sequence of singular values of this matricization is denoted by (Jt(u), which we always 
assume to be defined on N (with extension by zero in the case of finite-dimensional T-Li), 
and we set 


rankt(u) := rank(>lt(u)) = #{fe G N: fTt,fc(u) / 0} . 


2.2 Soft Thresholding 


For X G M, soft thresholding with parameter a > 0 is defined by 

Sa{x) := sgn(x) max{|x| — a, 0} . 

In comparison, hard thresholding is given by ha{x) := (l — X[_a,o](a;))4:. 

Applied to each element of a vector or sequence, hard thresholding provides a very 
natural means of obtaining sparse approximations by dropping entries of small absolute 
value, which is closely related to best n-term approximation [To] , In contrast, soft thresh¬ 
olding not only replaces entries that have absolute value below the threshold by zero, 
but also decreases all remaining entries, incurring an additional error. However, this 
operation has a non-expansiveness property that is useful in the construction of iterative 
schemes, and that can be derived from a variational characterization. 

To describe this characterization, for a proper, closed convex functional J' : Q ^ M. 


on a Hilbert space Q and a > 0, following 24 we define the proximity operator prox^ : 

Q^Qhy 

proxT(u) := argmin< aj{w) -|- -||u — v||c } . 

vee 2 J 


(4) 


As shown in |24| , such operators have the following general property, which plays a crucial 
role in this work. 

Lemma 2.2. The proximity operator prox^ is non-expansive, that is, 

||prox(}(u) - prox^(v)||g < ||u - v||g , U, V G ^ . 


6 



Soft thresholding of a sequence v by applying Sa to each entry, Sq,(v) := ('Sa(i'i)) 
can be characterized as the proximity operator of the functional see e.g. [^, and is 
therefore a non-expansive mapping on £ 2 - 

An analogous characterization is still possible when soft thresholding is applied to the 
singular values of matrices or operators, which provides a reduction to lower matrix ranks. 
More precisely, the soft thresholding operation Sa for matrices is defined as follows: for 
a given matrix X with singular value decomposition 

X = Udiag(ai(X))V'^, 

where crj(X) denotes the z-th singular value of X, we set 

5„(X) :=Udiag(s„(a,(X)))V'^. 

Note that application of the hard thresholding ha to the singular values would instead 
correspond to a rank truncation of the singular value decomposition. For Hilbert-Schmidt 
operators X one can define ^^(X) analogously. 

The mapping Sa is the proximity operator for the nuclear norm, 

||X||, := ||a(X)||,, = J^ai(X). 

i>l 

Lemma 2.3. Let 81,82 he separable Hilbert spaces, X G HS(5i,52), and a > 0. Then 


proxy, li 


(X) = 5„(X) 


or in other words, 


Sa{l^) = argmin< a|| V| 
veHS 


+ -IIX-V 


|2 

Ihs 


}■ 


Note that this statement is shown for finite matrices X, e.g., in j^; the generalization 
to Hilbert-Schmidt operators can be obtained by finite-dimensional approximation, using 
Theorem 2.1 and that the £i-norm is lower semicontinuous with respect to componentwise 
convergence of sequences, which follows from Fatou’s lemma. 


3 Soft Thresholding of Hierarchical Tensors 

In this section, we construct a non-expansive soft thresholding operation for the rank 
reduction of hierarchical tensors. By St^a ■ TL ^ TL we denote soft thresholding applied 
to the matricization of the input, 

St,a{u) := oSaO Mt{u.). 

The complete soft shrinkage operator Sa ■ TL ^ TL is then given as the successive appli¬ 
cation of this operation to each matricization, that is, 

Sa(u) := S'e.q: o ... o S’i^Q,(u) . (5) 
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Clearly, the result of applying Sq, for a > 0 always has finite (but not a priori fixed) 
hierarchical ranks. 

For a hierarchical tensor u with suitably numbered edges t = the soft 

thresholding Sq(u) can be obtained as follows: starting with uq = u, for each t, we first 
rearrange Uj_i such that the root element is on edge t with a singular value decomposi¬ 
tion of Afi(ut_i), which exposes the singular values (Tt(ut_i) and thus allows the direct 
application of St^a to obtain ;= and finally Sq(u) = ug. An example of 

an order in which the edges in E can be visited for this procedure in the case d = 4 is 
given in Figure 


Remark 3.1. Assuming that we are given a hierarchical tensor with dim'l^j < n G N 
and representation ranks bounded by r, then the first step of bringing this tensor into an 
initial HSVD representation takes 0{dr‘^ + dr'^n) operations. Moving the root element 
from one edge to an adjacent one costs 0{r^ + r'^n) operations; if an appropriate ordering 
of the edges is used, the total complexity for applying Sq can thus be seen to be bounded 
by 0{dr^ + dr'^n) as well. 


Proposition 3.2. For any u, v G and a > 0, the operator Sq, defined in ([^ satisfies 
||Sq(u) - Sq(v)|| < ||u - v||. 


Proof. The statement follows by repeated application of Lemmata 2.2 2.3 and □ 


The following lemma guarantees that applying soft thresholding to a certain ma- 
tricization of a tensor does not increase the hierarchical singular values of any other 
matricization of this tensor. 


Lemma 3.3. For any v € T-L and for t, s = 1,..., E, one has crt^i{u) > crtA^sA^)) for 
a// i G N and any a > 0. 

Proof. Note that for the action of Ss,a, the tensor is rearranged such that the edge s 
holds the root element. Thus the statement follows exactly as in part 3 of the proof of 
Theorem 11.61 in [^, see also the proof of Theorem 7.18 in (^ ; there it is shown that 
when singular values are decreased at the root element, this cannot cause any singular 
value of any other matricization to increase. □ 

Using the above lemma, the error incurred by application of Sq to a tensor u can be 
estimated in terms of the sequences of hierarchical singular values crt(u), t = 1,..., E. 


Lemma 3.4. For any u £ Ti and r G No, let 


Tt^rA) ■— ~ '"'’ll = 

rankt(w)<r 




1 

| 2 \ 2 


k>r 


Furthermore, for any 6 > 0 we define 

rt^sA) ■— max{r G N: at^r > d} U {0} . 
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Then for any given a > 0, 


max df{u.) < ||Sa(u) — u|| < (6) 

’ ’ t=l 

where, with Tt,a{'^) := 

df (u) := ||(Tt(S't,„(u)) - fTt(u)|| = ^Ja^rt,a{vh) + (Tt,a(u))^ . 

Remark 3.5. It can be seen that the upper estimate in (|^ is generally sharp by choosing 
u as a tensor of rank one (that is, with all hierarchical ranks equal to one) with ||u|| = Ea. 
In this case, Sq,(u) = 0. 

Proof of Lemma 3-4 We first show the second inequality in ([^. Let 

vi := u, vt := o ... o 5pa,(u), t>2. 

By a telescoping sum argument, applying the soft thresholding error estimate to each 
individual application of Sa,t, we obtain 

E E 

||S„(u) -u|| < ^||5't,„(vi) - Vtll < 


t=i 


t=i 


It remains to show that df{vt) < df{u). This follows from Lemma 3.3, whose repeated 
application gives crt^i{u) > (Tt_i(5i^Q,(u)) > (Jt^i{S 2 ,a ° <S'i,q(u)) > ... for each t. 

To show the first inequality in ([^, we again invoke Lemma 3.3, in this case to infer 
that for each t, 

^|(Jt,i(S„(u)) - fTt,i(u)|2 > - cTt,i(u)p = (df (u))^ . 

i>l i>l 

Moreover, by Theorem |2.1[ we have 

||(Tt(Sa(u)) - atiu)\\ < ||Alt(Sa(u)) - Ali(u)||HS = l|Sa(u) - u|| , 

and taking the maximum over t concludes the proof. 


Remark 3.6. Let u £ TL, then cTt(u) G £ 2 , which implies that d“(u) —>■ 0 as a 
Without further assumptions, however, this convergence can be arbitrarily slow. 

(a) If in addition (Tt(u) G ip^oo for a p G (0, 2) and for each t, we have 


□ 

0 . 






a 


rt,c 


< \atiu)\f 


see llOl, and thus 


|So(u) - u|| < E max|o-i(u)|^'^^ 
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(b) If cJt,fc(u) < Ce , A: G N, with C, c, /3 > 0, then arguing similarly as in 
7.4], we obtain 

n,a < (c“^ ln(C'a“^))? < (1 + |lna|)/3 , < (1 + |lna|)^ a, 

and therefore 

||S«(u) - u|| < .E(l + llnal)^ a. (7) 

Remark 3.7. It is well known that soft thresholding is closely related to convex optimiza¬ 
tion by proximal operator techniques. Note that the soft thresholding for hierarchical 
tensors can be written as 


10 


Section 


Sq — 


o Spa = proxj 


o proxj^ 


( 8 ) 


with Jt := ||Alt(-)||* for t = 1,... ,E. Thus in our setting, we do not have a characterisa¬ 
tion of Sq, by a single convex optimisation problem (as provided for Sa by Lemma 2.3), 
but still by a nested sequence of convex optimization problems: one has Sa(u) = u^;, 
where 


ut := argmm 
ven 


m|jt( 


v) -F - v| 

2q; 




with uq := u. 


4 Fixed-Point Iterations with Soft Thresholding 

In this section, we consider the combination of Sq, with an arbitrary convergent fixed 
point iteration with a contractive mapping E: Ti ^ Ti, that is, there exists p G (0,1) 
such that 

II/"(v) — T'(w)|| < p||v — w|| , v,wG?7. (9) 

In the example of a linear operator equation = f with elliptic A, we may choose 
E{v) = V — p,{Av — f) with a suitable scaling parameter p > 0. A practical scheme for 
this particular case will be considered in detail in Section 

Since Sq is non-expansive, the mapping Sq o T" still yields a convergent fixed point 
iteration, but with a modified fixed point. 

Lemma 4.1. Assuming ([^, let u* be the unique fixed point of E. Then for any a > 0, 
there exists a uniquely determined u" such that u“ = Sq(J^(u“)), which satisfies 

(1 -h p)“^||Sq(u*) - u*|| < ||u" - u*|| < (1 - /9 )“^||Sq(u*) - u*|| . (10) 

Moreover, for any given uq, for := SQ(J-'(ufc)) one has 

IN. ..“II 

||Ufc — U II < /? ||Uo — u II . 
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Proof. By the non-expansiveness of S^, the operator ^ := Sq o is a contraction. The 
existence and uniqueness of u“, as well as the stated properties of the iteration, thus 
follow from the Banach fixed point theorem. Let e" := Q{vP) — ^(u*). Then since 
Q{vP‘) = u“, one has 

u“-u* = g(u*)-u*+e“. 

Combining this with the observation 

||e“|| = ||g(u“) - gK)|| < ||^(u“) - ^(u*)|| < p||u“ - u*|| , 


where we have again used that Sq, is non-expansive, yields 

||u“-u*|| < ||g(u*)-u*||+p||u“-u*||, ||g(u*)-u*|| < ||u“-u*||+p||u“-u* 


Finally, noting that ^(u*) = Sq,(u*) since = u*, we arrive at (10). 


□ 


Theorem 4.1 tells us that if we keep the thresholding parameter a fixed, the thresh- 
olded Richardson iteration will converge, at the same rate p as the unperturbed Richard¬ 
son iteration, to a modified solution u“. Its distance to the true solution u* is uniformly 
proportional to ||Sci(u*) — u*||, that is, to the error of thresholding the exact solution. 


4.1 A Priori Choice of Thresholding Parameters 


In order to ensure convergence to u*, instead of working with a fixed a, we will instead 
consider the iteration 

Ufc+i = S„^(T'(ufc)) , (11) 

where we choose with —>■ 0. The central question is now how one can obtain a 

suitable such choice; clearly, if the ak decrease too slowly, this will hamper the conver¬ 
gence of the iteration, whereas ak that decrease too quickly may lead to very large tensor 
ranks of the iterates. 

In principle, if the decay of the sequences crf(u*) is known, for instance o't(u*) G ^p,oo) 
then Remark 3.6 immediately gives us a choice of values for ak that ensure convergence 
to u* with almost the unperturbed rate p. To this end, observe that 

||Ufc+i - u*|| < ||Ufc+i - 


< p\\uk - U*|| -F 


-k ||u"'' - U*|| < p\\uk - u*|| -F (1 -F p)||u"'= - u* 

1-p^ 


d“'=(u*), 


( 12 ) 


and based on Remark 3.6 we can adjust ak in every step so as to balance the decrease 
in the two terms on the right hand side of (12). Choices of ak in (11) for the respective 
cases in Remark 3.6 are given in the following proposition. 

Proposition 4.2. Let crt(u*) G ip,oo, t = for a p ^ (0,2), let cq > 0, and let 

2 

uq := 0. Then for the ehoice ak ■= {p^^^Co)^-p in the iteration ( |11[ ), we have 

||ufc — u*|| < (||u*|| -I- C'i?max|cri(u*)|^^^ k) p^ , (13) 
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where C depends on p, p, and cq. Furthermore, for any p > p, with ak 
we have 


I * 11 ^ 

|Ufc - u II < 


u 


+ 


CE maxt \ at (u*) |^^^ p 


p-p 


:= (p^+^co)2-p, 

(14) 


Under the same assumptions, but with the stronger eondition at^ki'^* ) < Ce with 
C, c,/3 > 0, for the choice ak ■= p^~^^co we have 


Ufc - u 


<Ek^+^p^. 


and with ak ■= p^~^^co, we have instead 


l|ufc - u*|| < Ek^^ / , 

where the constant depends on {p — p)~^■ 


(15) 


Proof. Using (12) and Remark |3.6[ a), we obtain 

||ufc-u*|| < /||u*|| +Ci^max|cJi(u*)||^^^/-*-i 


k-l 


a. 


1-2 
2 


2 = 0 


Note that with our choice of ak, we have a- ^ which implies the hrst statement. 


For the second choice of ak, we obtain instead 


Ufc — u*|| < (^^l|u*ll + Ci£'max|(7t(u*)||^ ^cq 6*^ ^ * 


fc-i 


i=0 


P 


Under the second set of assumptions, we proceed analogously based on Remark |3.6[ b), 
which for ak '■= p^^^cq yields 


k-l 

Ilufc - U*|| < p^ + F; J^p"-*-i(l + llncol + i\\np\)hp^+^ . 

j=0 

where the modihed power of k in the assertion thus arises due to the logarithmic term 
in (Q. For ak '■= p^~^^cq, with 6 as above, 

k-l 

l|uA:-u*|| < (e^+ EY,0^-^-\l + i\\np\)^p)p^+^ <Ek^^p\ □ 

i=0 

In summary, in this idealized setting with full knowledge of the decay of ||So(u*) —u*|| 
with respect to a, we can achieve convergence of the iteration with any asymptotic rate 
p> p. 
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4.2 Rank Estimates 


We now give estimates for the ranks of the iterates that can arise in the course of the 


iteration, assuming that the are chosen as in Proposition 4.2 


For the proof we will use the following lemma, which is a direct adaptation of 
Lemma 5.1], where the same argument was applied to hard thresholding of sequences; it 
was restated for soft thresholding of sequences (with the same proof) in [^. 

Lemma 4.3. Let v, w G H and e > 0 such that || v — w|| < e, and for f G {1, ..., E} let 
crt{v) G ip^oo for a p € (0, 2). Then 


Ae 


rankt(S„(w)) < -^ + 


-p 


U o't,k{^) < Ce for /c G N with C,c, (3 > 0, then 

4e" 




rankt(Sa(w)) < —^ + (c ^ ln(2C'a ^)) 


Proof Note first that as an immediate consequence of Theorem 2.1 for each t we have 


1 

|o-f(v) - cri(w)|| = < ||Afi(v) - A4i(w)||HS 


= V — w < e . 


Furthermore, Lemma 3.3 yields ranki(SQ(w)) < rankt(S't^Q,(w)), and it thus suffices to 
estimate the latter. 

The first inequality in the statement now follows with the same argument as in [^, 
which we include for the convenience of the reader. We abbreviate a := ert{v), b := (Ti(w). 
Let Xi := {i'. bi> a,ai > q:/ 2} and X 2 := {i- bi> a,ai < q;/ 2}. Then 


ON 2 
2 


#X2 < ^|aj - bi\‘^ < 
i€l2 


as well as 


#Xi < #{T ai > a/2} < Cp\at{-v)f/ a ^, 


which proves the first statement, since rankt(5qo(w)) < ffXi + ffT 2 - To obtain the 
second inequality, we use Remark 3.6'b) to estimate ffTi in an analogous way. □ 

Theorem 4.4. Let p > p, \iq := 0, and Sk '■= p^■ If o' 4 (u*) G ip^oo, t = 1,..., E, for a 

p G (0,2), then for the choice ak ■= cq) with cq > 0 m the iteration , we 

have 


lufc - u*|| < dek , max rank* (ufc) < d ", s = -- I. 


(16) 


Under the same assumptions, but with at^ki'^* ) < Ce , t = 1,E, with C,c,/3> 0, 
for the choice ak '■= p^~^^cq we have 


Ufc - 


U*|| <d(l + |logefc|)2/3£;., max rankt(ufc) < d^(l + llogefcl)^ • (17) 
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Proof. Recall that E = 2d — 3. The estimates ||ufc — u*|| < Ee^ were shown in (14) 
and (15) of Proposition 4.2 To obtain the corresponding estimates for the ranks, we 
use Lemma 4.3 Note that with := T'(ufc_i), we have = SQ^_^(wfc) and, by 


contractivity of E, 

||wfc - U*|| < /9||ufc-l - U*|| < E£k-l 
In the first case, for each t, Lemma [4.3| gives 

{Esk)'^ ^ - 


rankt(Saj^_j(wfc)) 


< 




2pk 


2 ^ 2k 2pfc 


a 


k-l 
1 


Noting that (p ’^-p) = p ^ we obtain the first assertion. In the second case 

we have Eek/o-k < i?(l + |logefc|) , and the lemma thus yields 

rankt(Sa(wfc)) < E‘^[1 + |logeA:|)^ + (l + |logp^co|)^ < E'^[1 + |logefc|)? . □ 


5 A Posteriori Choice of Parameters 


The results of the previous section lead to the question whether the results in Theorem 


not available. We thus now consider a modified scheme that adjusts the automatically 
without using such information on u*, but still yields quasi-optimal rank estimates as in 
Theorem 14.41 for both cases considered there. 

The design of such a method is more problem-specific than the general considerations 
in the previous section, and here we thus restrict ourselves to linear operator equations 
= f with A symmetric and elliptic. We assume to have 7 , T > 0 such that 


4.4 can still be recovered when a priori knowledge of the decay of the sequences (Tt(u*) is 


7 l|v|p < (^v, v) < r| 


V gT-L . 


(18) 


that is, the spectrum of A is contained in [ 7 , T] and k := 7 ^T is an upper bound for the 
condition number of A. The choice p := 2/(7 -|- T) then yields 


||Id-M|| < 

K -|- 1 

and the results of the previous section apply with E{v) := v — p^Av — f). 


(19) 


In order to be able to obtain estimates for the ranks of iterates as in Theorem 4.4 


the method in Algorithm is constructed such that whenever Ok is decreased in the 
iteration, 

||ufc+i - u*|| < C'||u"'“ - u*|| (20) 

holds with some fixed constant C > 1. It will be established in what follows that the 
validity of such an estimate ensures that Ok never becomes too small in relation to the 
corresponding current error ||ufc — u*||. A bound of the form (20) is ensured by the 


condition in line of Algorithm which is explained in more detail in the proof of 
Theorem 15.11 below. 
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Algorithm 1 = STSolve(A, f; e) 


parameters: 7 , F as in (18), fj,, p as in ( |19[ ), arbitrary u,9 ^ (0,1) and oq > S“^//||f| 
output: Ug satisfying ||u£ — u*|| < e. 

1: uo := 0, ro := —f, k := 0 
2 : while ||rfc|| > ye 
3: Ufc+I = (ufc - prk) 

4: Ffc+i := Auk+l - f 


5: 

6 : 

7 

8 
9 

10 

11 

12 


if ||ufc+i - Ufcll < ^^^ ||rfc+i|| then 


a/c+i := Oak 

else 

«fc+i:= Ofc 
end if 
k <— k + 1 
end while 

Ue := Ufc 


rp 


Note that Algorithm only requires - besides a hierarchical tensor representation of f 
and the action of A on such representations - bounds 7 , F on the spectrum of A, certain 
quantities derived from these, as well as constants that can be adjusted arbitrarily. The 
following is the main result of this work. 

Theorem 5.1. Algorithm^ produces with ||u£ — u*|| < e in finitely many steps. 
Furthermore, if (Tt(u*) G ip^oo, t = 1,... ,E, for a p G {0, 2), then there exists p G (0,1) 
such that with Ek ■= p^, the iterates satisfy 

1 _ 1 

l|ufc-u*|| < defc , ^^ax^rankt(ufc) < max^|cr.r(u*)|/^_^£fc " , s = i-i.( 21 ) 

If < Ce~^^^, t = 1,... ,E, with C,c,f5 > 0 , then the analogous statement holds 

with ^ 

l|ufc - u*|| < defc , max rankt(ufc) < d^(l + |lnefc|) ^ . ( 22 ) 

In the proof we will use the following technical lemma, which limits the decay of the 
soft thresholding error as the thresholding parameter is decreased. 

Lemma 5.2. Let v / 0, then df (v) < d“^df“(v), t = 1,..., E, for all a > 0, 0 G (0,1). 

Proof. For the proof, we omit the dependence of quantities on v. We clearly have > 

rt^a and Tt^a < n,a- Furthermore, “ '^t,ea ^ - n,a), and consequently, 

( dfy _ + a‘^{rp9a - rpa) ^ ^ 

Vdf"y {Oafirt^ea + ~ {0a)‘^rtfia + Tt,ea 
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(23) 


Proof of Theorem \5.1\ Step 1: We first show that the condition 

ii„ .. II / 0 -- II 

— -n h^/c+1 

in line of the algorithm is always satisfied after a finite number of steps. 

The combination of the first inequality of ([^ in Lemma 3.4 and the first inequality 
in (10) of Lemma 4.1 shows 


(1 + p)-^ maxdf (u*) < ||u“ - u* 


Thus we always have ||u" — u* || > 0 if a > 0, unless u* = 0 and hence f = 0. In the latter 
case, however, the algorithm stops immediately, and we can thus assume that u“ ^ u* 
for any positive a. 

If Qifc = ... = ao) we have on the one hand 


||ufc+i - Ufcll < ||ufc+i - u"°|| + ||ufc - u"0|| < /(I +p)||uo - u“0|| , (24) 

and on the other hand, we similarly obtain 

7"^l|rfc+i|| > Ilufc+i - u*|| > ||u"o - u*|| - /+^||uo - u"o|| 

> (1 - /+^)||u“o - u*|| - /+^||uo - u*|| . (25) 


Thus the right hand side in (24) converges to zero, whereas the right hand side in (25) 
is bounded away from zero for sufficiently large k. Hence (23) holds with = Jo — 1 
for some Jq G N, and we assume this to be the minimum integer with this property. 
The thresholding parameter is then decreased for the following iteration, that is, ojq = 
9aj„-i = 9ao. As in (12), for k < Jq we obtain 


Ufc+I - u*|| < /+^||uo - U*|| + {1 + p‘ 




u’ 


ao 


— U 


(26) 


The same arguments then apply with oq replaced by aj^ and Uq by uj^. Thus will 
always be decreased after a finite number of steps. 

Step 2: To show convergence of the u^, we first observe that by our requirement 
that ao > Jl“^;u||f|| (which is in fact not essential for the execution of the iteration), we 
actually have ui = 0 and hence u"° = 0, implying also Jq = 1. In particular. 


|u„ - U*|| < llu* - u"°| 


0 < n < Jo = 1 


(27) 


We next investigate the implications of the condition (23) for the further iterates. Note 
first that 


|Ufc+l - u 


Ufc+I - u"'=|| < ||u* - U' 


I 


(28) 


The standard error estimate for contractive mappings, combined with (23) and (18), 
gives 

||ufc+i - u"*!! < —^||ufc+i - Ufcll < ^||rfc+i|| < z^||ufc+i - u*|| . (29) 

I- p T 
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Inserting the latter into (28), we thns obtain 


||ufc+i - u*|| < (1 - I/) ^(||ufc+i - u*|| - ||ufc+i - u“'=||) < (1 - zy) ^||u* - u"'=|| . (30) 

We introdnce the following notation that gronps iterates according to the correspond¬ 
ing values of for each i G No, let r]i := 0*ao- With uo,o = uq = 0 and ro,o = —f, 
iterates are produced according to 


w, 


j+l •— f, (3f) 


where the index i is increased each time that condition (23) is satisfied. For each i, 
consistently with the previous definition of Jo, we define J* as the last index of an iterate 
produced with the value 77 ,, which means that Uj+i^o = ^i,Ji and, as a consequence of 


|ui+po - u*|| = ||uj,j. - u*|| < (1 - ly 


I ^llu* — I 


i > 0. 


(32) 


For i > 0 and j = 0,..., Jj, with (32) we obtain 

||ujj - U*|| < ||ujj - u^*|| -h llu’’* - u* 


< fP\\vLifl - u*|| -h (1 -h /0^)||u^‘ - U*|| 

< (1 - z^)-V'l|u'’‘-' - u*|| + (1 + P^')||u'’‘ - u* 


(33) 


where we have used (27) in the case i = 0. By Remark 3.6 this implies in particular that, 
in our original notation, —)• u*. 

Step 3: Our next aim is to estimate the values of Ji. We have already established 
that Jo = 1. In order to estimate Ji for i > 0, we use (24) and (25) to obtain 

-h/0)||Ui,o -u^*|| 


u 


*j+i 


— u 


I 


< 


I 


— U*|| — /j'+^IIUj^o ~ 


(34) 


for j sufficiently large. Thus (|23|) follows if the two conditions 

< 

'i — U*|| 

hold. These are guaranteed if 


||u’?*—u*|| 2rp llu'?* —u*|| 


< 


j > |lnp| {C{j,T,iy)+\n 


|w,o - 
Ilu^* — U* II 


(35) 


with some constant C( 7 ,r,z^) > 0. By (32), 


In 


|uj,0 - 


and by Lemma [3.4|and Lemma [4.1[ 


< ln( 1 -|- (1 — 17 ) 


-1 


U' 


Vi-i _ u* 


u^i - U* 


U' 


Vi-1 _ u* 


— U* 


< 


(1 + p)l|s,,,i(u*) -u 


, {l + p) ^ df-^^%u*) ^ il + p)E 


(l-/5)l|S^i(u*)-U*|| (1 -P) ^ df“°(u*) ( 1 -/ 5 )^’ 


17 


























where we have used that as a consequence of Lemma 5.2, the quotients df “°(u* 


remain uniformly bounded by 0 Putting this together with (35), we thus obtain 


Ji ^ ln(d), 


with a uniform constant depending on 7 , P, u, and 9. In view of (33), this implies that 


Ufc converges to u* at a linear rate in cases ( 21 ) and ( 22 ). 


Step 4- In order to establish rank estimates, we need to bound the errors of Wj j as 


dehned in (31) for each i > 0 and 0 < j < Jj. Since uq = = 0 by our choice of ao) 

(36) 


for i = 0 we obtain 

||woj - U*|| < p||uo 7 _i - U*|| = p||u^° - u*|| , j = Jo = 1, 
and for i > 0 and j > 0 , by 


w. 


ij - U*|| < /0||Ujj_i - u*|| < (1 - 0 - U*|| + p{l + p> 




u " — u 


(37) 


By Lemma 4.3, with M := maxt|(Tt(u*)|£pfor all t we have 

, * 112 


Tankt{uij) < -^ 

m 


fMPp.P, incase(pTl), 

2 + fiVi), fidi) ■= ^ 


|(1 + |ln 7 j|)/ 3 , in case ( 22 ). 


Note that this also covers Uj^o for i > 0, since Uj^o = for * > 0 and uq^o = 0. 


In case (21), as a consequence of (27), (33), (36), (37), with Remark |3.6[a) we obtain 




for all respective i,j, and consequently 

rankt(uij) < = (1 + E‘^)MPr]~^ . 

P 1-2 p 1-2 

By the same argument, we also have Hujj — u* || < EM^. Setting Sij := M ‘2 rj- ^, 
we thus have Hujj — u* II < Esi. J as well as 


ranki(ui 7 ) < (1 + E‘^)Mse. 


i,j ’ 


1 1 


This completes the proof of (21). In the case (22), Remark 3.6(b) yields, expanding 
r]i = e^ao, 

||uj7-u*||, ||wij-u*|| < j;(l + illn^D^e*, 

and hence 

rankt(ui7) < E‘^{1 + i|ln0|)^ + (1 + ^|ln0|)? . 

We choose 6 G {9, 1) and set Sij := 0® to obtain Hujj — u*|| < Esij and 

ranki(ui7) < f 1 + |^^|ln0®A < E"^ {l + \lneij\)^ . 


iin^r 


This completes the proof of (22). 


□ 
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Remark 5.3. The above algorithm is universal in the sense that it does not require 
knowledge of the decay of the (Tt(u*), but we still obtain the same quasi-optimal rank 


estimates as with ak prescribed a priori as in Theorem 4.4 Note that in (22), we have 
absorbed the additional logarithmic factor that is present in the error bound in (fT 


by 

comparing to a slighly slower rate of linear convergence, but the estimates are essentially 
of the same type. 


Remark 5.4. For the effective convergence rate p in the statement of Theorem 5.1 


as can 


be seen from the proof (in particular from the estimates for the Jj), one has an estimate 

1 

from above of the form p < < 1, where p does not explicitly depend on d (although 

it may still depend on d through other quantities such as 7, T). Consequently, combining 


this with the statements in (21) and (22), we generally have to expect that the number 
of iterations required to ensure ||ufc — u* || < e scales like |log p\~^ ((log e + log d) log d ). 


5.1 Inexact Evaluation of Residuals 

We finally consider a perturbed version of Algorithm where residuals are no longer 
evaluated exactly, but only up to a certain relative error. We assume that for each given 
V and J > 0, we can produce r such that ||r — (Av — f)|| < d. 

We will show below that for our purposes it suffices to ensure a certain relative error 
for each computed in Algorithm more precisely, to adjust 5 for each k such that 

||rfc - (Aufc - f)|| < min{ri||rA.||,r2/U“^||ufc+i - Ufc||} , 

with suitable ti,T 2 > 0. This can be achieved by simply decreasing the value of 6 and 
recomputing (and the resulting u^+i) until 5 < min{ri||rfc||,r2p“^||ufc_|_i — Ufc||} is 
satisfied. With such a choice of <5, we then have in particular 

(1 - ri)||rfc|| < IIAufc - f|| < (1 + ri)||rfc|| . (38) 

Our scheme can be regarded as an extension of the residual evaluation strategy used 
in |12| in the context of an adaptive wavelet scheme, where the residual error is controlled 
relative to the norm of the computed residual. Note that in our algorithm, the error 
tolerance 5k used for each computed is adjusted twice: first in line to ensure the 
accuracy with respect to ||rfc||, and possibly a second time in line[^ (after incrementing 
k) to ensure the accuracy with respect to ||ufc+i — u/j||. 

The analysis of the resulting modified Algorithm [^follows the same lines as the proof 
of Theorem |5.H and we obtain the same statements with modified constants. We do not 
restate the full proof, but instead indicate how the central estimates are modified. For a 
given iterate u^, we now denote the exact residual by := — f and the computed 

residual by r^. 

We first consider the infiuence of the perturbation on the iteration without thresh¬ 
olding, but with inexact residual, for which we obtain 

\\{uk - pvk) - u*|| < \\{uk - pfk) - U*|| + p\\rk - Ffcll < p||ufc - u*|| +Tip\\rk\\ . 
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(39) 


Using ||r/^|| < (1 — ri) ^\\vk\\ ^ (1 ~ u) ^r||ufc — u*|| and /xF < 2, 


(Ufc - /iFfc) - u II < p + 


2ti 

1 - Ti 


|Ufc - U 


which yields a contraction provided that ti < (3 — — p). However, in (36) and 

( |37[ ), where the bound (39) is required, n < 1 is in fact sufficient. 

We now turn to the contractivity of the iteration with thresholding. Note that 

||ufc+i - Ufcll < ||ufc+i - Saki^k - /^ffc)|| + IlSaJufc - - u"'“|| + Hu"'' - Ufc|| , (40) 

where by non-expansiveness of Sq^., 

||ufc+i - Sa^{\ik - fifk)\\ < /r||rfc - ffcll , 


and thus, since /u||rfc — r^H < T2||ufc_|_i — Ufc|| by our construction, (40) gives 


I II ^ ^ P II Ofc I 

Ufc +1 - Ufc < -- Ufc - u *= 

1 - r2 


As a consequence. 


|ufc+i - u"''|| < ||SQ,j^(ufc - jurk) - + ||Saj^(ufc - /rrfc) - Sajufc - fj.fk)\\ 

< p\\uk -u“'“|| + p\\rk -rfcll 

(1 + p)T2 


< p(t 2 )||ua: - u“'“|| , p{t 2 ):=p + 


1 — T2 


(41) 


where p{t 2 ) < 1 holds precisely when T2 < 1(1 — p); in other words, the perturbed hxed 
point iteration then has the same contractivity property with a modihed constant. 
Furthermore, we show next that the validity of the modihed condition 


|ufc+i - Ufell < 5||rfc+i|| , B := 


(l-p)(l-ri)z/ 


(1 + r2)(p + (1 - r2) i(l + p)r2)r’ 


(42) 


in Algorithm still implies that the corresponding iterates satisfy 

To this end, as in the proof of Theorem 5.1, it suffices to show that (42) implies 
||ufc_|_i — u“''|| < z^||ufc_|_i — u*||. On the one hand, by the construction of and the 
standard error estimate for hxed point iterations, we have 

||ufc+i - u"*!! < p(r2)||ufc - u"'=|| < ^^||S«Jufc - fifk) - Ufcll 

^p(r 2 )(l + r2) 

^ ^- Ufc+l — Ufc , 

1- p 

and on the other hand, by the construction of r^+i, 

||ufc+i - u*|| > r“^||rfc+i|| > (1 - ri)r"^||rfc+i|| . 
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Algorithm 2 = InexactSTSolve(A, f; e) 


parameters: fi, p as in ([I^ , arbitrary G (0,1), uq > E ^/r||f| 

Ti G (0,1), T 2 G (0, 2(1 - p)), B as in 

< e. 


output: Ue satisfying 


2' 

|U£ - U* 


, D as in (43). 


1 

2 

3 

4 

5 

6 

7 

8 
9 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 
21 
22 

23 

24 

25 

26 


uo := 0, ro := -f 
k := 0, (5o := n||ro|| 
while ||rfc|| + > ye 

Ufc+l • (ufc p^k) 

while 6k > r2p~^||ufc_|_i — Ufc|| A 6k > Ii*||rfc|| with D as in (43) 

6k 

compute Tk such that ||rfc — (Aufc — f)|| < <5^ 

Ufc+i ^ (ufc - prk) 

end while 

^k+l '■= ^~^6k 
repeat 

6k+i ^ ^6k+i 

compute Ffc+i such that ||rfc_|_i — (Au^+i — f)|| < 6k+i 
if ||rfc+i|| + 4+1 < ye then 
set Uj := Ufc+i and stop 
end if 

until 4+1 < ri||rfc+i|| 

if ||ufc_|_i — Ufcll < i?||rfc_|_i|| with B as in (42), then 
Ofc+i := dak 

4+1 Ti ||r/c+i II 

else 

otk+i '■= oik 
end if 
A: A: + 1 

end while 

Ue := Ufc 


Combining these two estimates, we find that (42) implies 


With this implication 


and the modified estimates (39) and (41), one can now follow the proof of Theorem 5.1 


to obtain the same statements. 

There are two additional checks in the algorithm to ensure that the 4 cannot become 
arbitrarily small. On the one hand, when the condition in line 14 of Algorithm is 
satisfied, then ||Aufc_|_i — f|| < ye, which implies ||ufc+i — u*|| < e, and we can therefore 
stop the iteration. 

On the other hand, if the loop in line exits because the second condition with the 
constant 


D := min 


pi'T2{l - Tl)" 


(1 - Ti)t2B _ 

(1 + ri + TB)p ’ (p(l + ri)(l + T2) + iy{l - ti)(1 - p))p 


(43) 
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is violated, that is, if 


4 < D\\rk\ 


(44) 


then the condition ||ufc_|_i — u"*"!! < z/||ufc_|_i — u*|| as in (29) is satisfied and condition 
(42) in line 18 is gnaranteed to hold, which means that will be decreased. To see this, 
note first that 


|rfc+i|| > (1 + n) ^((1 - n)||rfc|| - r||ufc+i - u^H), 


(45) 


and since the first condition in line|^still holds, we have ||ufc+i—Ufc|| < fiT 2 ^4- Therefore 
(44) implies in particnlar 

(1 + (1 + ri)“^rS)||ufc+i - Ufcll < + - Ti)||rfc|| , 


which combined with ( |45[) i mplies (42). Fnrthermore, (44) also yields, with the second 
nimnm in (|43[), the estimate 

(r2'V + 1^)4 < - ri)(l + ri)“^((l - ri)||rfc|| - ■ (46) 


case in the minimnm in (|4^, the estimate 
P 


I- p 


Since i/||ufc+i — u*|| > > uT ^(1 — Ti)||rfc_|_iII, by ( |45[ ) the right hand side in 

(46) can be estimated from above by z^||ufc+i — u*||. For the left hand side, we have 

P 


1 - p 


{t 2 V + At)4 > “^^11 + PWT^k -rfcl 


> 


I- p 


ISaJufc - prk) - Ufcll > ||Ufc+i - u' 


«fc I 


and from (46) altogether we obtain ||ufc_|_i — u“'“|| < i^||ufc+i — u*|| as reqnired. 

Note that as a conseqnence of this constrnction, the 4 obtained in Algorithm 
remain proportional to ||rfc|| dnring the iteration. 


6 Numerical Experiments 

In principle. Algorithms and can be applied to qnite general discretized elliptic 
problems, since only bonnds on the spectrnm of the discrete operator A are reqnired. 
For onr nnmerical tests, we choose a particnlar setting where we have a snitable method 
for preconditioning with explicit control of the resnlting condition nnmbers available: we 
test Algorithmic on a discretized Poisson problem with homogeneons Dirichlet bonndary 
conditions 

-Au = f on (0,1)'^, 

nsing similar techniqnes as for the adaptive treatment in with the difference that we 
now nse a wavelet Galerkin discretization with the basis functions chosen in advance. 

We shall now briefly describe how the discrete operator A is obtained as a symmetri¬ 
cally preconditioned Galerkin discretization in a tensor prodnct wavelet basis. Starting 
from an orthonormal basis of snfhciently regnlar (mnlti-)wavelets of A^(0,1), 
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we obtain a tensor product orthonormal basis {^u}u£\^d, of L^((0,1)'^) with 'E'l, := 
ipu-i <8> • • • ® such that the rescaled basis functions v G V'^, where 

'■= 11^1(0,1) + ••• + II’ 

form a Riesz basis of Hq{{0, 1)'^). We now pick a fixed finite subset Ai C V and set 
A := Ai X • • • X Ai C V'^. Furthermore, we use the family of low-rank approximate 
diagonal scaling operators S“^, n G N, constructed in we choose a 5 G (0,1) and 
then take h according to Theorem 4.1] such that 

(1 - 5)||diag(a;“^)v|| < ||S^V|| < (1-k 5)||diag(a;“^)v|| 

for all sequences v supported on A. With 


d 

i=l 




Ajt'G A 


fA :=((/,'k.)) 


z/GA • 


we then set 


A := S^^TaS 


-1 


f := SrifA 


Thus u* = S.TX^fA, where the additional scaling by Sfi yields convergence of the scheme 
in Ff^-norm at a controlled rate. An approximation of which in turn is a Galerkin 

approximation of the sequence of L^-coefhcients {u, 'if^) of the true solution, can then be 
recovered by applying to the computed u^. For A, one can obtain accurate bounds 
for 7, F, and in particular. 




A,j/eAi 


In our numerical tests, as in we take / = 1 and use the piecewise polynomial, contin¬ 
uously differentiable orthonormal multiwavelets of order 7 constructed in [Tl] . The uni¬ 
variate index set Ai comprises all multiwavelet basis functions on levels 0,..., 4, which 
yields #(Ai) = 224. The unspecified constants in Algorithm are chosen as 6 := |, 
io i' := ao := 5/u||f||, and we take 5 := 

Note that since the resulting diagonal scalings consist of 10 separable terms, 
a naive direct application of A could increase the hierarchical ranks of a given input 
V by a factor of up to 200; the observed ranks required for accurately approximating 
^v, however, are much lower. Therefore we use the recompression strategy described 
in Section 7.2] for an approximate evaluation of with prescribed tolerance in order 
to avoid unnecessarily large ranks in intermediate quantities. In this setting, the inexact 
residual evaluation in Algorithm is thus of crucial practical importance. 

We compare the computed solutions to a very accurate reference solution of th e 
discrete problem obtained by an exponential sum approximation Uq Ia, see 


13 


15 


The error to the reference solution is computed as err^ := ||diag(a;i/)(Sj^^Ufc —Uo)||, which 
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rfcll eiTfc/lIrfcll au 


Figure 2: From left to right (versus iteration number fc): computed discrete residual 
norms ||rfc||, ratios of differences to reference solution err^ to ||rfc||, corresponding thresh¬ 
olding parameters each for d = 16 (light grey), d = 32 (dark grey), d = 64 (black). 



Figure 3: Solid lines: maximum and minimum hierarchical ranks of iterates, i.e., 
min^ rankt(ufc) and maxt rankt(ufc); dotted lines: maximum ranks of computed resid¬ 
uals, maxt rankt(rfc); each versus k. 


is proportional to the error in dd^-norm of the corresponding represented functions. The 
quantity err^ thus serves as a substitute for the difference in the relevant norm of to 
the exact solution of the discretized problem. 

The numerical results for d = 16, d = 32, and d = 64 are shown in Figures 
and It can be observed in Figure that the norm of the solution of the problem 
decreases slightly with increasing d; apart from this, the iteration behaves very similarly 
for the different values of d, producing in particular a monotonic decrease of discrete 
residual norms. As expected, these values also remain uniformly proportional, up to 
very moderate constants, to the dd^-difference to the reference solution. The values ak 
can be seen to hrst decrease in every step as long as = 0; subsequently, they decrease 
in a regular manner after an essentially constant number of iterations. As one would also 
expect, the hnal value of needs to be slightly smaller for larger d. 

Figure 1^ shows the maximum and minimum hierarchical ranks of the computed it¬ 
erates (whose difference grows slightly with increasing d) compared to the ranks of the 
corresponding computed discrete residuals r^, clearly demonstrating the reduced rank 
increase relative to that we obtain by the approximate residual evaluation. The addi- 
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tional variation in the residual ranks is a consequence of the fact that the the differences 
||ufc_|_i — Ufcll decrease as long as ak remains constant, enforcing a more accurate residual 
evaluation. As soon as the thresholding parameter changes, the accuracy requirement is 
subsequently relaxed again by line 20 in Algorithm]^ since the values ||ufc_|_i — 


are 


again increased when u"'" changes. Note furthermore that the ranks show little variation 
with increasing d, which is substantially more favorable than the quadratic increase with 


d that is possible in the estimates (21) and (22) of Theorem 5.1 


7 Conclusion 


We have constructed an iterative scheme for solving linear elliptic operator equations in 
hierarchical tensor representations. This method guarantees linear convergence to the 
solution u* as well as quasi-optimality of the tensor ranks of all iterates, and is universal 
in the sense that no a priori knowledge on the tensor approximability of u* is required. 

However, if it is known that the hierarchical singular values of u* have, for instance. 


exponential-type decay, then Theorem 4.4 shows that one can obtain the same properties 
by a priori prescribing Uk that decrease geometrically at some rate p > p. In such a 
setting, this simpler approach with a priori choice may thus be a viable alternative. 

Since the given a priori choices of thresholding parameters work for quite general 
contractive fixed point mappings, the construction of schemes that make this choice a 
posteriori may be possible for more general cases than the linear elliptic one treated here. 
In this regard, note that although we have always assumed for ease of presentation that 
the considered operator A is also symmetric, this is not essential. 

In this work, we have considered hxed discretized problems, but we expect that the 
basic strategy proposed here can also be used in the context of adaptive discretizations. 
Moreover, there may exist other related soft thresholding procedures for tensors than the 
sequential approach underlying our construction that retain the required features. 
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